Make qwen3's set_embed_and_head idempotent by fzyzcjy · Pull Request #26798 · sgl-project/sglang

fzyzcjy · 2026-05-31T01:45:31Z

Guard the weight deletions in Qwen3ForCausalLM.set_embed_and_head with
hasattr checks so the method is idempotent: calling it when
embed_tokens.weight / lm_head.weight have already been deleted (e.g. a
second invocation, or after a prior tie/share step) no longer raises
AttributeError.

CI States

Latest PR Test (Base): ❌ Run #26700294436
Latest PR Test (Extra): ❌ Run #26700294381

Guard the weight deletions in `Qwen3ForCausalLM.set_embed_and_head` with `hasattr` checks so the method is idempotent: calling it when `embed_tokens.weight` / `lm_head.weight` have already been deleted (e.g. a second invocation, or after a prior tie/share step) no longer raises `AttributeError`.

gemini-code-assist

Code Review

This pull request modifies the set_embed_and_head method in python/sglang/srt/models/qwen3.py to check for the existence of the weight attribute before deleting it. The reviewer suggested a more robust implementation to handle pipeline parallelism and tied word embeddings, recommending that the code check if the layers are instances of PPMissingLayer and avoid redundant operations when self.lm_head is the same object as self.model.embed_tokens to prevent unnecessary memory usage.

gemini-code-assist · 2026-05-31T01:46:29Z

+        if hasattr(self.model.embed_tokens, "weight"):
+            del self.model.embed_tokens.weight
+        if hasattr(self.lm_head, "weight"):
+            del self.lm_head.weight
        self.model.embed_tokens.weight = embed
        self.lm_head.weight = head


When pipeline parallelism is enabled (pp_size > 1), ranks other than the first rank will have self.model.embed_tokens as a PPMissingLayer, and ranks other than the last rank will have self.lm_head as a PPMissingLayer.

Additionally, when tie_word_embeddings is enabled, self.lm_head is the exact same object as self.model.embed_tokens.

The current implementation:

Unnecessarily assigns weight to PPMissingLayer placeholders on ranks where they are not used, which can prevent the large embed or head tensors from being garbage collected on those ranks.

Performs redundant deletion and setting operations when weights are tied.

We can make this method fully robust and memory-efficient by checking if the modules are instances of PPMissingLayer and ensuring we don't perform redundant operations when self.lm_head is self.model.embed_tokens.

Suggested change

if hasattr(self.model.embed_tokens, "weight"):

del self.model.embed_tokens.weight

if hasattr(self.lm_head, "weight"):

del self.lm_head.weight

self.model.embed_tokens.weight = embed

self.lm_head.weight = head

if not isinstance(self.model.embed_tokens, PPMissingLayer):

if hasattr(self.model.embed_tokens, "weight"):

del self.model.embed_tokens.weight

self.model.embed_tokens.weight = embed

if not isinstance(self.lm_head, PPMissingLayer) and self.lm_head is not self.model.embed_tokens:

if hasattr(self.lm_head, "weight"):

del self.lm_head.weight

self.lm_head.weight = head

fzyzcjy · 2026-05-31T01:52:31Z

Reviewed for correctness — looks good to me. ✅

The change guards the two del statements in set_embed_and_head with hasattr checks, which correctly achieves the stated idempotency:

hasattr(module, "weight") is the right check here — nn.Module.__getattr__ raises AttributeError for an unregistered parameter, so hasattr returns False once the weight has been deleted, and the del is safely skipped.
Calling the method twice now yields the same result as calling it once (the re-assignment + empty_cache/synchronize are harmless to repeat).
It also makes the PP non-last-rank case (where lm_head is a PPMissingLayer without a weight) more robust instead of raising.
The assignment path is unchanged, so there's no behavior regression.

Minor (non-blocking, out of scope): the same set_embed_and_head exists in ~40 other models (llama.py, deepseek_v2.py, gemma4_mtp.py, ...) all still using bare del — they likely share the same latent issue and could get the same guard later.

Note: both PR Test CI badges are currently red; worth confirming the failures are unrelated to this trivial change before merging.

gemini-code-assist Bot reviewed May 31, 2026

View reviewed changes

fzyzcjy merged commit 13ca55a into main May 31, 2026
86 of 99 checks passed

fzyzcjy deleted the tom/pr_chain/tom/kv_canary_revert_reversed/make-qwen3-s-set-embed-and-head-idempotent branch May 31, 2026 01:47

fzyzcjy mentioned this pull request May 31, 2026

Sandbox: verify full main CI is green on latest main (do not merge) #25647

Closed

4 tasks

JEF1056 pushed a commit to JEF1056/sglang-turboquant that referenced this pull request Jun 1, 2026

Make qwen3's set_embed_and_head idempotent (sgl-project#26798)

e0e0ace

JEF1056 pushed a commit to JEF1056/sglang-turboquant that referenced this pull request Jun 1, 2026

Make qwen3's set_embed_and_head idempotent (sgl-project#26798)

77393ec

xjpang pushed a commit to xjpang/sglang that referenced this pull request Jun 2, 2026

Make qwen3's set_embed_and_head idempotent (sgl-project#26798)

441aa89

mqhc2020 pushed a commit to mqhc2020/sglang that referenced this pull request Jun 2, 2026

Make qwen3's set_embed_and_head idempotent (sgl-project#26798)

c5ca175

hanming-lu pushed a commit that referenced this pull request Jun 3, 2026

Make qwen3's set_embed_and_head idempotent (#26798)

cfa9d36

alphabetc1 pushed a commit to alphabetc1/sglang that referenced this pull request Jun 4, 2026

Make qwen3's set_embed_and_head idempotent (sgl-project#26798)

e3cfc78

jeynmann pushed a commit to jeynmann/sglang that referenced this pull request Jun 4, 2026

Make qwen3's set_embed_and_head idempotent (sgl-project#26798)

578a924

edwingao28 pushed a commit to edwingao28/sglang that referenced this pull request Jun 7, 2026

Make qwen3's set_embed_and_head idempotent (sgl-project#26798)

78da23c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make qwen3's set_embed_and_head idempotent#26798

Make qwen3's set_embed_and_head idempotent#26798
fzyzcjy merged 1 commit into
mainfrom
tom/pr_chain/tom/kv_canary_revert_reversed/make-qwen3-s-set-embed-and-head-idempotent

fzyzcjy commented May 31, 2026 •

edited by github-actions Bot

Loading

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 31, 2026

Uh oh!

Uh oh!

fzyzcjy commented May 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

fzyzcjy commented May 31, 2026 • edited by github-actions Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

CI States

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 31, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

fzyzcjy commented May 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

fzyzcjy commented May 31, 2026 •

edited by github-actions Bot

Loading